Dataset on target chemical and bioassay analysis—Exploring contaminants of emerging concern in a low mountain river of central Germany

Chemical pollution of the aquatic environment is nowadays characterised by increasing levels of anthropogenic organic compounds at low concentrations and is recognised as one of the main drivers of the deteriorated ecological state of European waterbodies. To improve the understanding of the impact of chemical pollution in surface waters, a combined approach of chemical and bioanalytical testing is considered necessary for effective ecologically oriented water management. For this dataset, six 25-L water samples were collected at six sampling sites along the Holtemme River in Central Germany using large-volume solid phase extraction. All samples were analysed by targeted high-resolution liquid chromatography–mass spectrometry (LC–MS) and a selected bioanalytical test battery using effect-based methods. These methods included cytotoxicity assessment, several mechanism-specific CALUX® tests to identify endocrine and oxidative stress-related effects and the fish embryo acute toxicity test to investigate (sub)lethal effects in the model species Danio rerio. This approach provided a dataset that offers a longitudinal characterisation of the chemical pollution and ecotoxicological impacts. The combination of chemical analysis and effect-based analysis is valuable for future studies as it will help researchers, risk assessors and authorities to identify hot spots of chemical pollution, monitor environmental quality standards and recommend mitigation strategies.

Samples were taken by large-volume solid phase extraction of 25 L of river water using cartridges with the sorbent CHROMABOND HR-X (Macherey-Nagel). The cartridges were freeze-dried and eluted with ethyl acetate, methanol, methanol containing formic acid and methanol containing ammonia. The extracts were evaporated (Multivapor P-6/Rotavapor R-300, BÜCHI Labortechnik AG) before they were subjected to chemical analysis and bioanalytical assessment. The chemical data were obtained by means of liquid chromatography using an UltiMate 3000 LC system (Thermo Fisher Scientific Inc., Waltham, USA), and high-resolution mass spectrometry equipped with electrospray ionisation (ESI), using an Orbitrap MS (Q Exactive Plus, Thermo Fisher Scientific). After peak detection in MZmine
• The presented dataset offers valuable insight into the state of contamination along a river gradient in Central Germany flowing through forested, urban as well as agricultural areas, as it provides a comprehensive overview of compounds of emerging concern, covering pesticides, pharmaceuticals, personal care products and industrial chemicals.
• The characterisation of the effect profile using bioanalytical tools adds valuable information on the ecotoxicological impact of surface water contamination.
• The combination of target chemical data and effect-based data is of value for future studies as it will help researchers, risk assessors and authorities to identify hot spots of chemical pollution, monitor environmental quality standards and recommend mitigation strategies.

Background
Water pollution in Europe is characterised by an ever-increasing number of anthropogenic organic compounds that are detected at low concentrations. Besides indirect inputs from agriculture, industry and urban surfaces, direct sources like conventional wastewater treatment plants without advanced treatment techniques, hospital effluents and industrial discharges are considered the main pathways for pollutants to enter the aquatic environment [2, 3]. The consequences of this anthropogenic environmental pollution for biodiversity, ecosystems and their living constituents remain poorly understood and are often difficult or impossible to assess [4, 5]. Nevertheless, chemical pollution is considered one of the main drivers of the deteriorated ecological state of European waterbodies [6], and combining chemical analysis with effect-based tools (i.e. bioanalytical assays) is regarded as an important concept for ecology-orientated water management [2, 7, 8]. Here, we present a dataset comprising comprehensive chemical analysis and ecotoxicological effect characterisation in river water samples along a gradient of increasing anthropogenic influence. This dataset can be used to evaluate the state of pollution with organic compounds from anthropogenic sources along the river course using, e.g., mixture risk assessment approaches, whose results could in part be supported or contradicted by the bioanalytical data. Thus, the dataset can help to identify potential hot spots of pollution and pinpoint possible compounds of biological concern.

Data Description
The dataset presented includes chemical and bioanalytical data for river water samples collected from six individual sampling sites along the Holtemme River catchment (Fig. 1) in Central Germany, using on-site Large-Volume Solid Phase Extraction (LVSPE).
The main dataset comprises five delimiter-separated values files using semicolon as delimiter (csv-format) and one Excel spreadsheet (xlsx-format) that can be downloaded from [1]. The file "LVSPE.csv" contains all chemical data, the file "EBM.csv" includes the processed data from effect-based methods (EBMs) and the files "FET.csv" and "NR.csv" contain the pre-processed data from the fish embryo toxicity test (FET) and neutral red retention assay (NR), respectively. Additionally, we provide the file "Water_parameters.csv", which contains a set of water parameters determined during the sampling as well as the discharge at two gauges along the river during sampling (at sampling sites H1 and H4). Table 1 summarises the content of all reported columns in the csv-files. The Excel spreadsheet file "EBM_raw_data.xlsx" comprises eight individual sheets containing raw fluorescence and luminescence reads for the NR and all Chemically Activated Luciferase gene eXpression (CALUX®) assays along with the respective pipetting scheme. More details are summarised in Table 2. The sheets AR_raw, anti-AR_raw, ER_raw, anti-ER_raw, Nrf2_raw, PR_raw and GR_raw contain the raw luminescence reads from the AR-, anti-AR-, ER-, anti-ER-, Nrf2-, PR- and GR-CALUX®, respectively. Columns A and B contain the plate number and the well row of each plate, respectively. Columns C to L contain the relative luminescence reads, columns M to V the respective pipetting scheme.
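As a minimal sketch of how the semicolon-delimited files could be read, the following Python example parses a small in-memory excerpt with the standard library. The column names (site, compound, concentration_ng_per_L) and all values are hypothetical placeholders, since the actual headers are those listed in Table 1.

```python
import csv
import io

# Hypothetical excerpt in the layout described above (semicolon-delimited).
# Column names and values are placeholders, not taken from the published files.
SAMPLE_TEXT = """site;compound;concentration_ng_per_L
H1;compound_A;1.5
H3;compound_A;42.0
H3;compound_B;7.25
"""


def read_semicolon_csv(handle):
    """Return the rows of a semicolon-delimited file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter=";"))


rows = read_semicolon_csv(io.StringIO(SAMPLE_TEXT))

# Group the numeric values by sampling site.
by_site = {}
for row in rows:
    by_site.setdefault(row["site"], []).append(float(row["concentration_ng_per_L"]))

print(by_site)
```

For a downloaded file, the same function could be called on a handle opened with open("LVSPE.csv", newline=""); the appropriate text encoding is not stated in the article and would need to be checked.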

Sampling sites, sampling and sample preparation
The Holtemme River is a low mountain stream located in Central Germany that springs in the Harz/Saxony-Anhalt Nature Park and flows into the Bode River as a left tributary after approx. 47 km. The Holtemme thus belongs to the Elbe/Saale/Bode river system and has a catchment area of ca. 282 km². It represents a watercourse that has only been marginally altered by human influences in its upper reaches, but is increasingly subjected to hydromorphological changes and anthropogenic stressors in its downstream reaches (Fig. 1). The variety of chemical pollution sources ranges from diffuse sources, e.g. from agricultural land or urban areas, to temporary point sources, such as rainwater retention basins, to continuous point sources, such as the two wastewater treatment plants Silstedt (80,000 population equivalents; commissioned: 1996) and Halberstadt (60,000 population equivalents; commissioned before 1990, last expansion: 2000) [9].
Six priority sampling sites along the river course (H1, H2, H3, H4, H5 and H6) were chosen to cover the various anthropogenic impact types. The reference site (H1) was located in the headwaters of the river and reflects negligible human influence on the waterbody and riparian zone. Within the City of Wernigerode, the sampling site H2 was selected as an urban impact site. Positioned downstream of the outlet of the Silstedt wastewater treatment plant (WWTP), H3 characterised a site affected by WWTP effluent. Surrounded by arable land, H4 represented a site where impacts of agricultural land use were likely to occur. A second urban impact site (H5), located after the river has passed the City of Halberstadt, and a second WWTP-affected site (H6), receiving effluent from the WWTP Halberstadt, were selected. The sampling sites H1 to H5 were 6 to 8 km apart, while H6 was approximately 400 m downstream of H5 (Fig. 1).
In September 2021, 25 L of water were sampled over the course of three hours at each site, using LVSPE [10]. At each sampling location, 50 aliquots of 500 mL water were collected from the middle of the stream at a depth of 20–30 cm. The water was pumped to the LVSPE machine through PTFE tubing (≤ 6 m) equipped with a custom-made stainless steel mesh filter (mesh size: 0.85 mm) to prevent solid matter from reaching the machine. In the machine, the water was pumped through a prefilter (Sartopure GF+ MidiCap, 0.65 μm pore size, Sartorius) into a borosilicate glass dosing system (volume: 500 mL) and subsequently through a conditioned (LC-MS grade ethyl acetate, methanol and water) custom-made cartridge, filled with 10 g of sorbent (CHROMABOND HR-X, Macherey-Nagel, Düren, Germany). For a trip blank control, an additional cartridge was carried during the entire sampling campaign under the same conditions as the sample cartridges.
Upon arrival in the laboratory, the cartridges were blown dry with nitrogen gas and subsequently freeze-dried before elution. Elution was done using 100 mL of ethyl acetate, 100 mL of methanol, 100 mL of methanol containing 1 % (v/v) formic acid and 100 mL of methanol with 2 % (v/v) 7 N ammonia. The extracts were evaporated (Multivapor P-6/Rotavapor R-300, BÜCHI Labortechnik AG, Flawil, Switzerland) to reach a final relative enrichment factor (REF) of 40,000, the solvent was exchanged to DMSO (LC-MS grade) and extracts were stored at −20 °C until further use.
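The relative enrichment factor can be illustrated with a short calculation. The sketch below assumes the common definition of the REF as the volume of extracted water divided by the volume of the final extract; the function names are hypothetical and the resulting extract volume is an arithmetic consequence of that assumption, not a value reported in the article.

```python
def extract_volume_ml(sample_volume_l, ref):
    """Extract volume (mL) implied by a REF, assuming REF = V_water / V_extract."""
    return sample_volume_l * 1000.0 / ref


def ref_ratio(ref_stock, ref_target):
    """Fold difference between two enrichment levels."""
    return ref_stock / ref_target


# 25 L of river water concentrated to a REF of 40,000.
final_volume = extract_volume_ml(25.0, 40_000)

# Ratio between the stored extract (REF 40,000) and a REF of 1000.
fold = ref_ratio(40_000, 1_000)

print(final_volume, fold)
```

Under this assumption, 25 L at a REF of 40,000 corresponds to 0.625 mL of extract, and a REF of 1000 is 40-fold less concentrated than the stored extract.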

Liquid chromatography-HRMS analysis
The extracts were reconstituted in LC-MS grade methanol at a REF of 1000. For analysis, LVSPE samples or the trip blank sample (REF 1000) were spiked with an internal isotope-labelled standard mixture. Aliquots of spiked LVSPE samples and the trip blank were injected into an UltiMate 3000 LC system (Thermo Fisher Scientific), using a reversed-phase column for separation. Mass spectrometry data were obtained by electrospray ionisation (ESI) in positive and negative mode, using a Q Exactive Plus quadrupole Orbitrap (Thermo Fisher Scientific). Matrix-matched calibration was used. For calibration, pristine river water (Wormsgraben, upper Harz mountains, Saxony-Anhalt Nature Park) spiked with target compounds was extracted by a laboratory-scale solid phase extraction method with the same sorbent and elution method as detailed above. Data evaluation was done as described in Finckh et al. [11]. Briefly, after peak detection in MZmine 2.38, the peak list was exported as a csv-file, and blank correction, calibration as well as internal standard quantification of target compounds were performed in the R package {MZquant} [12]. The respective method detection limits (MDLs) were calculated as outlined by the US EPA methodology [13], based on the standard deviation of replicated analysis of five matrix-matched calibration standards followed by a t-test.
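The MDL calculation can be sketched as follows, assuming the usual US EPA form in which the MDL is the standard deviation of replicate measurements multiplied by the one-sided Student's t value at the 99 % confidence level for n − 1 degrees of freedom. The replicate values are hypothetical, and the exact implementation used for this dataset may differ.

```python
import statistics

# One-sided Student's t critical values at the 99 % confidence level,
# keyed by degrees of freedom (n - 1).
T_99 = {2: 6.965, 3: 4.541, 4: 3.747, 5: 3.365, 6: 3.143}


def method_detection_limit(replicates):
    """MDL = t(n-1, 0.99) * sample standard deviation of the replicates."""
    degrees_of_freedom = len(replicates) - 1
    if degrees_of_freedom not in T_99:
        raise ValueError("no tabulated t value for this number of replicates")
    return T_99[degrees_of_freedom] * statistics.stdev(replicates)


# Hypothetical replicate results (ng/L) for one spiked calibration level.
measured = [9.0, 10.0, 11.0, 10.0, 10.0]
mdl = method_detection_limit(measured)

print(round(mdl, 3))
```

With five replicates the tabulated t value is 3.747, so the MDL is roughly 3.7 times the replicate standard deviation.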

Neutral red retention assay
The neutral red retention (NR) assay was used to assess the cytotoxicity of the sample extracts to the human osteosarcoma cell line U2-OS as described in Repetto et al. [14]. In triplicates, the cells (density of 100,000 cells/mL) were exposed for 24 h (37 °C, 97 % humidity, 5 % CO₂) to a dilution series of the extracts, a solvent control (0.1 % DMSO), a negative control (cell medium: DMEM/F-12 without phenol red, supplemented with stripped foetal bovine serum and minimal essential medium) and a positive control (sodium lauryl sulphate; 150 μg/mL). Subsequently, the exposure medium was discarded, neutral red solution (33.33 μg/mL) was added and the cells were incubated at the above-mentioned conditions. After 2 h, the cells were washed twice with phosphate-buffered saline, and an acetic acid/ethanol solution was added and shaken for 20 min. Finally, the NR was measured by fluorescence reading (excitation: 530 nm, emission: 645 nm), using the Tecan Spark® (Tecan Trading AG, Männedorf, Switzerland). The raw data from each replicate along with the respective pipetting scheme are provided in the file "EBM_raw_data.xlsx" in the sheet "NR_raw". The cytotoxicity was calculated from blank-corrected fluorescence readings in relation to the solvent control and is provided in the file "NR.csv". Moreover, the 20 % inhibitory concentration (IC₂₀) specified in the file "EBM.csv" was estimated in Prism 9 (GraphPad Software, Boston, MA, USA) with a two-parametric non-linear logistic regression.
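The two calculations described above can be illustrated with a minimal sketch. It assumes that cytotoxicity is expressed as the percentage loss of blank-corrected fluorescence relative to the solvent control, and that the two-parametric logistic model has the form inhibition = 1 / (1 + (IC50 / c)^slope) with fixed limits of 0 and 1. Both the model form and all numbers are assumptions for illustration; the regression itself was performed in Prism 9 and is not reproduced here.

```python
def cytotoxicity_percent(sample, solvent_control, blank):
    """Percentage loss of blank-corrected signal relative to the solvent control."""
    corrected_sample = sample - blank
    corrected_control = solvent_control - blank
    return 100.0 * (1.0 - corrected_sample / corrected_control)


def inhibitory_concentration(ic50, slope, fraction):
    """Concentration giving `fraction` inhibition under the assumed logistic model.

    Derived by solving fraction = 1 / (1 + (ic50 / c) ** slope) for c.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return ic50 * (fraction / (1.0 - fraction)) ** (1.0 / slope)


# Hypothetical fluorescence reads (arbitrary units).
tox = cytotoxicity_percent(sample=4100.0, solvent_control=5100.0, blank=100.0)

# Hypothetical fitted parameters, concentration expressed in REF units.
ic20 = inhibitory_concentration(ic50=10.0, slope=1.0, fraction=0.2)

print(round(tox, 1), round(ic20, 2))
```

Under this model the IC20 is always the IC50 multiplied by 0.25 raised to the power 1/slope, so a steeper curve moves the IC20 closer to the IC50.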

Chemically activated luciferase gene eXpression assay
The CALUX® test system, a set of mechanism-specific in vitro reporter gene assays, was used to evaluate the potential for endocrine disruption and oxidative stress-related effects. These assays are based on the human osteosarcoma cell line U2-OS, transfected with a firefly luciferase gene and coupled to a responsive element of interest such as the oestrogen receptor alpha (ERα), the androgen receptor (AR), the glucocorticoid receptor (GR), the progesterone receptor (PR) and the transcription factor Nrf2 (Nrf2). Activation of the responsive element leads to the production of associated proteins as well as luciferase, which induces light emission when luciferin is added as a substrate, allowing the assay to determine estrogenic, androgenic, glucocorticoid and progesterone effects and the potential to induce oxidative stress by luminescence measurement.
The agonistic (ERα-, AR-, GR- and PR-CALUX®), antagonistic (anti-ERα-, anti-AR-CALUX®) and Nrf2-CALUX® assays were performed as described in van der Linde et al. [15, 16]. In short, the corresponding cells (density of 100,000 cells/mL) were seeded in 96-well plates and incubated for 24 h (37 °C, 97 % humidity, 5 % CO₂) in assay medium (DMEM/F-12 without phenol red, supplemented with stripped foetal bovine serum and minimal essential medium). The medium was then removed and replaced with exposure medium containing a non-cytotoxic dilution series of the samples and a dilution series of the respective reference substances (17β-oestradiol for ERα, dihydrotestosterone for AR, dexamethasone for GR, medroxyprogesterone acetate for PR, tamoxifen for anti-ERα, flutamide for anti-AR and curcumin for Nrf2).
For antagonistic assays, the exposure medium was additionally spiked with the reference compound of the agonistic assay at a concentration corresponding to the EC₅₀. After incubation under the above-mentioned conditions for a further 24 h, the exposure medium was removed, the cells were lysed using lysis buffer (25 mM TRIS, 2 mM 1,4-dithiothreitol, 2 mM 1,2-diaminocyclohexanetetraacetic acid disodium salt, 10 % glycerol, 1 % Triton® X-100), luciferin substrate solution (20 mM tricine, 1.07 mM (MgCO₃)₄Mg(OH)₂·5H₂O, 2.67 mM MgSO₄·7H₂O, 0.1 mM EDTA, 1.5 mM 1,4-dithiothreitol, 539 μM D-luciferin, 5.49 mM ATP) was added and the resulting luminescence was quantified using a Tecan Spark® multimode reader (Tecan Trading AG, Männedorf, Switzerland). The determination of bioanalytical equivalents and limits of quantification from luminescence readings was performed in Excel (Microsoft Corporation, Redmond, Washington, USA) using macro-enabled templates provided by BioDetection Systems BV, following the calculations described in van der Linde et al. [15, 16] and references therein. Briefly, the bioanalytical equivalents were calculated based on the relative effect potency, comparing the effect-concentration relationship of the standard with that of the samples. The limits of quantification were determined based on replicated measurements of the solvent blank and calculated from the mean relative luminescence units and 10 times the standard deviation interpolated to the calibration curve of the respective standard dilution series. The validity of each replicate was evaluated based on the goodness of fit of the respective standard curve, the EC₅₀ of the respective standard and the induction factor between the highest and lowest luminescence reading of the standard curve. The raw data for each bioassay and each replicate along with the respective pipetting scheme can be found in the online repository [1] in the file "EBM_raw_data.xlsx". The processed data are provided in the file "EBM.csv".
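The two calculations summarised above can be sketched in Python under stated assumptions. The sketch takes the limit of quantification signal as the blank mean plus 10 standard deviations, interpolates it onto a four-parameter Hill curve for the reference compound, and takes the bioanalytical equivalent as the ratio of the effect concentration of the reference to that of the sample. The Hill form, the function names and all numbers are hypothetical; the published values were derived with the BioDetection Systems templates, which are not reproduced here.

```python
import statistics


def loq_signal(blank_rlu):
    """Signal threshold: mean of the solvent blank plus 10 standard deviations."""
    return statistics.mean(blank_rlu) + 10.0 * statistics.stdev(blank_rlu)


def hill_inverse(signal, bottom, top, ec50, slope):
    """Concentration at which the assumed Hill curve reaches `signal`.

    Curve: y = bottom + (top - bottom) / (1 + (ec50 / c) ** slope)
    """
    if not bottom < signal < top:
        raise ValueError("signal must lie between the curve limits")
    ratio = (signal - bottom) / (top - signal)
    return ec50 * ratio ** (1.0 / slope)


def bioanalytical_equivalent(ec_reference, ec_sample_ref):
    """BEQ as reference effect concentration divided by the sample EC (in REF)."""
    return ec_reference / ec_sample_ref


# Hypothetical solvent blank reads (relative luminescence units).
threshold = loq_signal([100.0, 110.0, 90.0])

# Hypothetical standard curve parameters for the reference compound.
loq_conc = hill_inverse(threshold, bottom=100.0, top=1100.0, ec50=10.0, slope=1.0)

# Hypothetical effect concentrations: reference in ng/L, sample in REF units.
beq = bioanalytical_equivalent(ec_reference=5.0, ec_sample_ref=2.0)

print(threshold, round(loq_conc, 3), beq)
```

Because the sample effect concentration is expressed in REF units, the resulting equivalent carries the concentration unit of the reference compound per volume of river water.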

Fish embryo acute toxicity test
Fertilised eggs of zebrafish (Danio rerio) were obtained for testing by mass spawning from the in-house zebrafish facility and collected within 2 h of spawning onset. Adult zebrafish (6–18 months) of a wild-type strain were used for rearing and groups of approximately 160 fish per tank (160 L) were reared in a flow-through system. Water quality was maintained by biofiltration followed by UV sterilisation in accordance with OECD Guideline 236. The water quality was
In accordance with OECD Guideline 236 [17], the prolonged fish embryo acute toxicity test (FET) up to 120 h post fertilisation (hpf) was performed in three independent replicates to evaluate the teratogenic potential of individual river water extracts. Briefly, 20 embryos per concentration and replicate were statically exposed to a dilution series of the extracts shortly after fertilisation in 96-well plates and incubated at 26 ± 1 °C with a 14:10 h light:dark cycle. Lethal and sublethal effects (Table 3) as well as successful hatching of embryos were recorded every 24 h. All experiments were terminated by euthanasia of larvae shortly before 120 hpf, as zebrafish larvae younger than 120 hpf are not protected animal stages according to EU Directive 2010/63/EU [18].
From the recorded effects matching the lethal criteria (Table 3), the mortality percentages were calculated for each replicate relative to the total number of embryos tested. The same was done for all combined effects (i.e. meeting lethal and/or sublethal criteria) to calculate the percentage effects. Additionally, hatching success was determined from the total number of hatched larvae in relation to the total number of larvae tested. Following the validity criteria in OECD Guideline 236, the validity of each replicate was checked and only valid replicates were used for further analysis. The data for the timepoint 119 hpf are summarised in the file "FET.csv". Furthermore, using a two-parametric non-linear logistic regression in Prism 9 (GraphPad Software, Boston, MA, USA), lethal and effect concentrations (LCs and ECs) for 5, 10, 20 and 50 % of the tested population were computed and are provided in the file "EBM.csv".
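The percentage endpoints described above amount to simple ratios, which the following sketch illustrates for a single replicate and concentration. The counts are hypothetical; only the group size of 20 embryos is taken from the text.

```python
def percentage(count, total):
    """Share of `count` in `total`, expressed in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Hypothetical counts for one replicate and one concentration at 119 hpf.
n_embryos = 20          # embryos tested per concentration and replicate
n_lethal = 3            # embryos meeting a lethal criterion
n_sublethal_only = 2    # embryos meeting only sublethal criteria
n_hatched = 17          # successfully hatched larvae

mortality = percentage(n_lethal, n_embryos)
combined_effect = percentage(n_lethal + n_sublethal_only, n_embryos)
hatching_success = percentage(n_hatched, n_embryos)

print(mortality, combined_effect, hatching_success)
```

Counting embryos that meet both lethal and sublethal criteria only once keeps the combined effect percentage at or below 100 %.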

Limitations
Due to insufficient sample volumes, only two valid replicates could be completed for the Nrf2-CALUX® and one valid replicate each for the GR- and PR-CALUX® assays. However, these limitations can be considered relatively minor, as the data obtained for the GR and PR assays showed no detectable effects and can therefore be considered a first no-effect screening. For the Nrf2 data, two replicates can be considered a sufficient basis for effect determination, although a third replicate would be preferable. Furthermore, no Nrf2 data were obtained for the trip blank sample, so an oxidative stress-related effect of the blank cannot be excluded. As the chemical analysis was conducted on a target list, the data might not sufficiently represent all possible compounds that exert an adverse effect on organisms inhabiting the aquatic environment. Furthermore, only a limited list of water parameters was determined. These limitations should be considered when evaluating the dataset.
© 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Value of the Data
2.38, blank correction, calibration, and quantification were performed in the R package {MZquant}. The target list included compounds from various chemical groups such as pesticides, biocides, personal care products, pharmaceuticals, industrials and natural compounds. The method detection limits were calculated as outlined by US EPA methodologies. The cytotoxicity of each sample was analysed by means of the neutral red retention assay using a Tecan Spark® multimode reader (Tecan Trading AG, Männedorf, Switzerland) to quantify the fluorescence of neutral red. Bioanalytical data were obtained by means of the CALUX® (Chemically Activated Luciferase gene eXpression) test system (BioDetection Systems BV, Amsterdam, The Netherlands), using a Tecan Spark® multimode reader (Tecan Trading AG, Männedorf, Switzerland) to quantify luminescence. Calculations of bioanalytical equivalent values as well as limits of quantification were performed in Excel (Microsoft Corporation, Redmond, Washington, USA), using macro-enabled templates provided by BioDetection Systems BV. The data were collected from H1 (51°49′01.1″N, 10°43′26.6″E), H2 (51°50′59.4″N, 10°47′49.4″E), H3 (51°52′04.4″N, 10°52′24.8″E), H4 (51°53′06.1″N, 10°57′47.1″E), H5

Table 1
Description of the columns included in the presented dataset.

Table 2
Description of the Excel spreadsheet containing the raw fluorescence and luminescence reads for the respective bioassays along with pipetting scheme used.

Table 3
Lethal and sublethal criteria scored during the fish embryo toxicity test.
Landau, Germany) as well as biweekly measurements of oxygen, water hardness, nitrite, nitrate and ammonia using commercial quick tests (JBL GmbH & Co. KG, Neuhofen, Germany and sera GmbH, Heinsberg, Germany). In addition, a weekly water exchange of approximately 40 % was achieved by automated addition of reverse osmosis-purified tap water reconstituted with Red Sea salt (Red Sea Deutschland, Düsseldorf, Germany) and sodium bicarbonate (food grade). The water temperature was maintained at 26 ± 2 °C and the lighting was adjusted to a light:dark rhythm of 14:10 h. The fish were fed daily with SDS small granular fish food (SDS, Witham, England) and live brine shrimp nauplii Artemia sp. (Great Salt Lake Artemia cooperative through ZebCare, Nederweert, The Netherlands).